Using Skipgrams, Bigrams, and Part of Speech Features for Sentiment Classification of Twitter Messages
نویسندگان
چکیده
In this paper, we consider the problem of sentiment classification of English Twitter messages using machine learning techniques. We systematically evaluate the use of different feature types on the performance of two text classification methods: Naive Bayes (NB) and Support Vector Machines (SVM). Our goal is threefold: (1) to investigate whether or not partof-speech (POS) features are useful for this task, (2) to study the effectiveness of sparse phrasal features (bigrams and skipgrams) to capture sentiment information, and (3) to investigate the impact of combining unigrams with phrasal features on the classification’s performance. For this purpose we conducted a series of classification experiments. Our results show that POS features are useful for this task while phrasal features could improve the performance of the classification only when combined with unigrams.
منابع مشابه
A High-Performance Model based on Ensembles for Twitter Sentiment Classification
Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people of the world intensity use social networks like Twitter. It supports users to publish tweets to tell what they are thinking about topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...
متن کاملGPLSI: Supervised Sentiment Analysis in Twitter using Skipgrams
In this paper we describe the system submitted for the SemEval 2014 Task 9 (Sentiment Analysis in Twitter) Subtask B. Our contribution consists of a supervised approach using machine learning techniques, which uses the terms in the dataset as features. In this work we do not employ any external knowledge and resources. The novelty of our approach lies in the use of words, ngrams and skipgrams (...
متن کاملSentiment Classification of Social Media Content with Features Generated Using Topic Models
This paper presents a method for using topic distributions generated from topic models as features for performing sentiment analysis on documents. This will be tested in the social media domain, specifically Twitter. The proposed approach allows for the mapping from word space to topic space which allows for less features to be needed and also reduces computational complexity. Multiple machine ...
متن کاملMHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs
In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of main the tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of Tweets. To this end, we create subjectivity lexicons in which the words into ...
متن کاملA Supervised Approach for Sentiment Analysis using Skipgrams
We present a supervised hybrid approach for Sentiment Analysis in Twitter. A sentiment lexicon is built from a dataset, where each tweet is labelled with its overall polarity. In this work, skipgrams are used as information units (in addition to words and n-grams) to enrich the sentiment lexicon with combinations of words that are not adjacent in the text. This lexicon is employed in conjunctio...
متن کامل